Section II: Data analysis

1. Overview of cases

  1. Cases in (Europe) between (2020-07-01) and (2022-03-01); random sample (30000) out of (112900) genomes
          a. Cases (see Figure 1.1.A)
          b. Lineages (see Figure 1.1.B)

2. Epidemiological cluster cohesion (ECC) analysis

  1. ECC histogram and contour plots
          a. Geospatial and temporal ECC histogram (TP1 and TP2) (see Figure 2.1.A)
          b. Geospatial and temporal ECC 3D surface density (see Figure 2.1.B)
  2. ECC Change vector directionality analysis
          a. Bubble plots with directionality vectors (see Figure 2.2.A)
          b. Summary of directionality vectors by TP1 cluster size (see Figure 2.2.B)
          c. Summary of directionality vectors by Cluster growth rate (see Figure 2.2.C)
          d. Summary of directionality vectors by ECC and Delta ECC level (see Figure 2.2.D)
  3. Cluster Growth and Centroids
          a. Top 20 growth clusters (by size) (see Figure 2.3.A)
          b. Top 20 growth clusters (by growth rate) (see Figure 2.3.B)
          c. Average geographic and temporal pairwise distance for top 20 clusters (by size) (see Figure 2.3.C)
          d. Average geographic and temporal pairwise distance for top 20 clusters (by growth rate) (see Figure 2.3.D)
  4. Monthly ECC trend analysis
          a. Monthly Geo and Temp ECC (average) (see Figure 2.4.A)
          b. Top 10 clusters (by size) Geo ECC, Geo-Temp ECC, Temp ECC (see Figure 2.4.B)
          c. Top 10 clusters (by growth rate) Geo ECC, Geo-Temp ECC, Temp ECC (see Figure 2.4.C)
  5. Weekly ECC trend analysis
          a. Weekly Geo and Temp ECC (average) (see Figure 2.5.A)
          b. Top 10 clusters (by size) Geo ECC, Geo-Temp ECC, Temp ECC (see Figure 2.5.B)
          c. Top 10 clusters (by growth rate) Geo ECC, Geo-Temp ECC, Temp ECC (see Figure 2.5.C)

3. Geospatial and temporal heterogeneity within top clusters

  1. Heatmaps and frequency histograms for Top 5 clusters (by size)
          a. GEO + 50-50 + TEMP heatmaps for top 5 cluster (see Figure 3.1.A)
          b. GEO + 50-50 + TEMP heatmaps for top 4 cluster (see Figure 3.1.B)
          c. GEO + 50-50 + TEMP heatmaps for top 3 cluster (see Figure 3.1.C)
          d. GEO + 50-50 + TEMP heatmaps for top 2 cluster (see Figure 3.1.D)
          e. GEO + 50-50 + TEMP heatmaps for top 1 cluster (see Figure 3.1.E)
          f. GEO + 50-50 + TEMP frequency histograms for top 5 cluster (see Figure 3.1.F)
          g. GEO + 50-50 + TEMP frequency histograms for top 4 cluster (see Figure 3.1.G)
          h. GEO + 50-50 + TEMP frequency histograms for top 3 cluster (see Figure 3.1.H)
          i. GEO + 50-50 + TEMP frequency histograms for top 2 cluster (see Figure 3.1.I)
          j. GEO + 50-50 + TEMP frequency histograms for top 1 cluster (see Figure 3.1.J)
  2. Heatmaps and frequency histograms for Top 5 clusters (by growth rate)
          a. GEO + 50-50 + TEMP heatmaps for top 5 cluster (see Figure 3.2.A)
          b. GEO + 50-50 + TEMP heatmaps for top 4 cluster (see Figure 3.2.B)
          c. GEO + 50-50 + TEMP heatmaps for top 3 cluster (see Figure 3.2.C)
          d. GEO + 50-50 + TEMP heatmaps for top 2 cluster (see Figure 3.2.D)
          e. GEO + 50-50 + TEMP heatmaps for top 1 cluster (see Figure 3.2.E)
          f. GEO + 50-50 + TEMP frequency histograms for top 5 cluster (see Figure 3.2.F)
          g. GEO + 50-50 + TEMP frequency histograms for top 4 cluster (see Figure 3.2.G)
          h. GEO + 50-50 + TEMP frequency histograms for top 3 cluster (see Figure 3.2.H)
          i. GEO + 50-50 + TEMP frequency histograms for top 2 cluster (see Figure 3.2.I)
          j. GEO + 50-50 + TEMP frequency histograms for top 1 cluster (see Figure 3.2.J)
  3. Pairwise distance for Top 20 clusters (by size)
          a. Pairwise geospatial distance for top 20 clusters (by size) (see Figure 3.3.A)
          b. Pairwise temporal distance for top 20 clusters (by size) (see Figure 3.3.B)

4. Data summary tables

  1. Summary of SARS-CoV-2 genomes by geography and time
          a. by country and Timepoint/Month (see Table 4.1.a)
          b. by country and week (see Table 4.1.b)
  2. Summary of SARS-CoV-2 genomes by PANGO lineages
          a. by PANGO lineages and country (see Table 4.2.a)
          b. by PANGO lineages and Timepoint/Month (see Table 4.2.b)
          c. by PANGO lineages and week (see Table 4.2.c)
          d. Summary table for change vector directionality (see Table 4.2.d)

5. Supplementary figures

  1. Genome counts over time
          a. By Country
                i. TP1 vs TP2 genome counts (see Figure 5.1.A.I)
                ii. Cumulative genome counts by month, faceted by country (see Figure 5.1.A.II)
                iii. Genome counts, faceted by month (see Figure 5.1.A.III)
                iv. Cumulative genome counts by week, faceted by country (see Figure 5.1.A.IV)
                v. Genome counts, faceted by week (see Figure 5.1.A.V)
          b. By Province
                i. TP1 vs TP2 genome counts (see Figure 5.1.B.I)
                ii. Cumulative genome counts by month, faceted by province (see Figure 5.1.B.II)
                iii. Genome counts, faceted by month (see Figure 5.1.B.III)
                iv. Cumulative genome counts by week, faceted by province (see Figure 5.1.B.IV)
                v. Genome counts, faceted by week (see Figure 5.1.B.V)
  2. Lineage prevalence over time
          a. By Month
                i. Cumulative genome counts for most prevalent lineages, by country (see Figure 5.2.A.I)
          b. By Week
                i. Cumulative genome counts for most prevalent lineages, by country (see Figure 5.2.B.I)
  3. Lineage spread by geographical area
          a. By Country
                i. TP1 vs TP2 of countries detected (see Figure 5.3.A.I)
                ii. Countries detected: most prevalent lineages, by month (see Figure 5.3.A.II)
                iii. Countries detected: most prevalent lineages, by week (see Figure 5.3.A.III)
          b. By Province
                i. TP1 vs TP2 of provinces detected (see Figure 5.3.B.I)
                ii. Provinces detected: most prevalent lineages, by month (see Figure 5.3.B.II)
                iii. Provinces detected: most prevalent lineages, by week (see Figure 5.3.B.III)
  4. ECC direction classes vs compass rose (see Figure 5.4)

1. Overview of cases

  1.1. Cases and lineages

2. Epidemiological cluster cohesion (ECC) analysis

  2.1. ECC histogram and contour plots
  2.2. ECC Change vector directionality analysis
  2.3. Cluster Growth and Centroids
  2.4. Monthly ECC trend analysis
  2.5. Weekly ECC trend analysis

3. Geospatial and temporal heterogeneity within top clusters

  3.1. Heatmaps and pairwise distance frequency histograms for top 5 clusters (by size)
  3.2. Heatmaps and pairwise distance frequency histograms for top 5 clusters (by growth rate)
  3.3. Pairwise distance for Top 20 clusters (by size)

4. Data summary tables

  4.1. Summary of SARS-CoV-2 genomes by geography and time
  4.2. Summary of SARS-CoV-2 genomes by PANGO lineages

5. Supplementary figures

  5.1. Genome counts over time
     a. By Country
     b. By Province
  5.2. Lineage prevalence over time
     a. By Month
     b. By Week
  5.3. Lineage spread by geographical area
     a. By Country
     b. By Province